A Survey of Binary Similarity and Distance Measures

Authors

  • Seung-Seok Choi
  • Sung-Hyuk Cha
  • Charles C. Tappert
Abstract

The binary feature vector is one of the most common representations of patterns, and similarity and distance measures play a critical role in many problems such as clustering and classification. Ever since Jaccard proposed a similarity measure to classify ecological species in 1901, numerous binary similarity and distance measures have been proposed in various fields. Applying appropriate measures results in more accurate data analysis. Nevertheless, few comprehensive surveys on binary measures have been conducted. Hence we collected 76 binary similarity and distance measures used over the last century and reveal their correlations through hierarchical clustering.
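As a rough illustration of what a binary similarity or distance measure computes, the sketch below evaluates three widely known measures (Jaccard similarity, simple matching, and Hamming distance) on two 0/1 feature vectors. The cell names `a`, `b`, `c`, `d`, the function names, the sample vectors, and the zero-denominator convention are assumptions made for this example; it is not a reproduction of the survey's 76 measures or of the authors' implementation.

```python
def contingency(x, y):
    """Count the four agreement/disagreement cells for two equal-length 0/1 vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # both present
    b = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # present only in y
    c = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # present only in x
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # both absent
    return a, b, c, d


def jaccard(x, y):
    """Jaccard similarity: shared presences over all positions with any presence."""
    a, b, c, _ = contingency(x, y)
    denom = a + b + c
    # Returning 0.0 for two all-zero vectors is a convention assumed here.
    return a / denom if denom else 0.0


def simple_matching(x, y):
    """Simple matching coefficient: fraction of positions where the vectors agree."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    return (a + d) / n if n else 0.0


def hamming(x, y):
    """Hamming distance: number of positions where the vectors differ."""
    _, b, c, _ = contingency(x, y)
    return b + c


if __name__ == "__main__":
    x = [1, 1, 0, 0, 1, 0]
    y = [1, 0, 0, 1, 1, 0]
    print(jaccard(x, y), simple_matching(x, y), hamming(x, y))
```

Jaccard ignores joint absences (`d`) while simple matching counts them, which is one reason different measures can rank the same pairs of vectors differently.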


Similar resources

New distance and similarity measures for hesitant fuzzy soft sets

The hesitant fuzzy soft set (HFSS), as a combination of hesitant fuzzy and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered as the important tools in different areas such as ...


An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...


A revised Fuzzy-PROMETHEE method, using Fuzzy Distance and Similarity Measures

PROMETHEE refers to a collection of methods of ranking in the field of multi-criteria decision making. These methods are characterized by conceptual simplicity and practical applicability. However, the nature of phenomena involving decision-making in real world leads us to use fuzzy method of preference ranking. The most common criticism on mathematical ranking procedures is that they tend to d...


HESITANT FUZZY INFORMATION MEASURES DERIVED FROM T-NORMS AND S-NORMS

In this contribution, we first introduce the concept of a metrical T-norm-based similarity measure for hesitant fuzzy sets (HFSs) by using the concept of a T-norm-based distance measure. Then, the relationship of the proposed metrical T-norm-based similarity measures with the other kind of information measure, called the metrical T-norm-based entropy measure, is discussed. The main feature ...


Correlation Analysis of Binary Similarity and Distance Measures on Different Binary Database Types

Binary similarity and dissimilarity measures are of great importance to pattern recognition and other fields. Here, correlations between pairs of 76 binary similarity and distance measures are studied. Some similarity measures are highly correlated while others are not, and the variability of the correlation can depend on the characteristics of the underlying binary data. To better understand t...



Publication date: 2009